ABSTRACT

Understanding the clustering of galaxies has long been a goal of modern observational cosmology. 
Redshift surveys have been used to measure the correlation length as a function of luminosity and color. 
However, when subdividing the catalogs into multiple subsets, the errors increase rapidly. Angular 
clustering in magnitude-limited photometric surveys has the advantage of much larger catalogs, but 
suffers from a dilution of the clustering signal due to the broad radial distribution of the sample. 
Also, up to now it has not been possible to select uniform subsamples based on physical parameters, 
like luminosity and rest-frame color. Utilizing our photometric redshift technique, a volume limited 
sample (0.1 < z < 0.3) containing more than 2 million galaxies is constructed from the SDSS galaxy 
catalog. In the largest such analysis to date, we study the angular clustering as a function of luminosity 
and spectral type. Using Limber's equation we calculate the clustering length for the full data set as 
$r_0 = 5.77 \pm 0.10\,h^{-1}$ Mpc. We find that $r_0$ increases with luminosity by a factor of 1.6 over the sampled 
luminosity range, in agreement with previous redshift surveys. We also find that both the clustering 
length and the slope of the correlation function depend on the galaxy type. In particular, by splitting 
the galaxies in four groups by their rest-frame type we find a bimodal behavior in their clustering 
properties. Galaxies with spectral types similar to elliptical galaxies have a correlation length of 
$6.59 \pm 0.17\,h^{-1}$ Mpc and a slope of the angular correlation function of $0.96 \pm 0.05$, while blue galaxies 
have a clustering length of $4.51 \pm 0.19\,h^{-1}$ Mpc and a slope of $0.68 \pm 0.09$. The two intermediate color 
groups behave like their more extreme 'siblings', rather than showing a gradual transition in slope. 
We discuss these correlations in the context of current cosmological models for structure formation. 
Subject headings: galaxies: clusters — galaxies: evolution — galaxies: distances and redshifts — galaxies: photometry — cosmology: observations — general: large-scale structure of the Universe — methods: statistical 



1. INTRODUCTION 

One of the primary tools for studying the evolution and 
formation of structure within the universe has been the 
angular correlation function (Totsuji & Kihara 1969). 
Possibly the simplest of these point process statistics is 
the angular 2-point function, which measures the excess 
number of pairs of galaxies, as a function of separation, 
when compared to a random distribution. If the universe 
can be described by a Gaussian random process 
then the 2-point function will fully describe the clustering 
of galaxies. While this is clearly not the case and 
higher order correlation functions play a significant role 
in understanding the clustering of structure and its evolution, 
the 2-point function remains an important statistical 
tool. In this paper we will utilize the angular 
2-point function to determine the type and luminosity 
dependence of the clustering of galaxies within the Sloan 
Digital Sky Survey (SDSS; York et al. 2000). 

1 Department of Physics and Astronomy, The Johns Hopkins University, 3701 San Martin Drive, Baltimore, MD 21218, USA 

2 Department of Physics and Astronomy, University of Pittsburgh, Pittsburgh, PA 15260, USA 

3 Institute for Astronomy, University of Hawaii, Honolulu, HI 96822, USA 

4 Department of Physics, Eötvös University, Budapest, Pf. 32, Hungary, H-1518 

5 Princeton University Observatory, Princeton, NJ 08544, USA 

6 Apache Point Observatory, P.O. Box 59, Sunspot, NM 88349, USA 

7 Steward Observatory, 933 N. Cherry Ave., Tucson, AZ 85721 

8 Astronomy and Astrophysics Department, University of Chicago, Chicago, IL 60637, USA 

9 Fermi National Accelerator Laboratory, P.O. Box 500, Batavia, IL 60510, USA 

10 Institute for Cosmic Ray Research, University of Tokyo, 5-1-5 Kashiwa, Kashiwa City, Chiba 277-8582, Japan 

11 Astronomy Centre, University of Sussex, Falmer, Brighton, BN1 9QJ, UK 

12 Department of Physics, University of Pennsylvania, Philadelphia, PA 19101, USA 

Studying the angular correlation function has a nat- 
ural advantage over the spatial or redshift correlation 
function. By only requiring positional information we 
can derive the clustering signal from photometric sur- 
veys alone (i.e. without requiring spectroscopic followup 
observations). Given the relative efficiencies of pho- 
tometric surveys over their spectroscopic counterparts, 
this enables the correlation function to be estimated for 
wide-angle surveys covering statistically representative 
volumes of the universe and without being limited by 
discreteness error. The disadvantage of the angular cor- 
relation function has been that it is the projection of the 
spatial correlation function over the redshift distribution 
of the galaxy sample. For bright magnitude limits the 
redshift distribution is well known (and relatively nar- 
row) and therefore deprojecting the angular clustering 
to estimate the clustering length is relatively straight- 
forward. At fainter magnitudes the redshift distribution 
becomes broader and the details of the clustering signal 
can be washed out. 

We can overcome many of the disadvantages of the an- 
gular clustering if we utilize photometric redshifts. Pho- 
tometric redshifts provide a statistical estimate of the 
redshift, luminosity and type of a galaxy based on its 
broadband colors. As we can control the redshift interval 
from which we select the galaxies (and the distribution 
of galaxy types and luminosities) we can determine how 
the clustering signal evolves with redshift and invert it 
accurately to estimate the real space clustering length, 
$r_0$, for galaxies. The ability to utilize large, multicolor 
photometric surveys as opposed to the smaller spectro- 
scopic samples means that we can subdivide the galaxy 
distributions by luminosity and type without being lim- 
ited by the size of the resulting subsample (i.e. most of 
our analyses will not be limited by shot noise). As we 
expect the dependence of the clustering signal to vary 
smoothly with luminosity, type and redshift, it is not ex- 
pected that the statistical uncertainties in the redshift 
estimates will significantly bias our resulting measures. 

The utility of photometric redshifts for measuring 
the clustering of galaxies as a function of 
redshift and type has been recognized when studying 
high redshift galaxies (Arnouts et al. 1999; 
Brunner, Szalay & Connolly 2000; Connolly et al. 
1998; Firth et al. 2002; Magliocchetti & Maddox 1999; 
Roukema et al. 1999; Teplitz et al. 2001). In this paper 
we focus on studying the angular clustering of intermediate 
redshift galaxies (z < 0.3) and the dependence of the 
clustering length on luminosity and galaxy type. This 
represents one of the first applications of photometric 
redshifts to the clustering of intermediate redshift 
galaxies for which we have a large, homogeneous and 
statistically significant sample. This paper 
is divided into five sections. In Section 2 we describe 
the data set used in this analysis and the selection of a 
volume limited sample of galaxies. In Section 3 we apply 
a novel approach for estimating the 2-point angular 
correlation function using Fast Fourier Transforms and 
we show the dependence of the slope and amplitude of 
the correlation function on luminosity and galaxy type. 
In Section 4 we invert the projected angular correlation 
function and derive the correlation lengths. In Section 5 
we discuss the bimodal behavior of clustering properties. 

2. DEFINING A PHOTOMETRIC SAMPLE 

The SDSS is a photometric and spectroscopic survey 
designed to map the distribution of stars and 
galaxies in the local and intermediate redshift universe 
(SDSS; York et al. 2000). On completion the SDSS will 
have observed the majority of the northern sky (~π 
steradians) and approximately 1000 square degrees in the 
southern hemisphere. These observations are undertaken 
in a drift-scan mode where a dedicated 2.5m telescope 
scans along great circles, imaging 2.5 degree wide stripes 
of the sky. The imaging data are taken using a mosaic 
camera (Gunn et al. 1998) through the five photometric 
passbands u', g', r', i' and z' (covering the ultraviolet 
through to the near infrared) as defined in Fukugita et al. 
(1996). The photometric system is described in detail 
by Smith et al. (2002) and the photometric monitor by 
Hogg et al. (2001). All data are reduced by an automated 
software pipeline (Lupton et al. 2003; Pier et al. 
2002) and the outputs loaded into a commercial SQL 
database. In our analysis, we will include data from runs 
with the longest contiguous scans (typically in excess of 
50 degrees). These data comprise a subset of the data 
that will be released to the public as part of Data Release 1 
(DR1; Abazajian et al. 2003). In comparison to 
the Early Data Release (Stoughton et al. 2002) the area 
analyzed in this paper is approximately 10 times larger, 
or approximately 20% of the entire survey area. 

Given the five band photometry from the SDSS 
imaging data we estimate the photometric redshifts 
of the galaxies using the techniques outlined in 
Budavári et al. (1999, 2000), Connolly et al. (1999) and 
Csabai et al. (2000). The details of the estimation 
techniques employed, together with the expected uncertainties 
within the redshift estimators, are given in 
Csabai et al. (2002). In this paper we will just note the 
effective rms error of the photometric redshifts (typically 
$\Delta z_{\rm rms} = 0.04$ at r* < 18). For all sources within a sample 
the redshift and its uncertainty are calculated together 
with a measure of the spectral type of the galaxy and its 
variance and covariance (with redshift). From these mea- 
sures we estimate the luminosity distance to each galaxy 
and calculate its r-band absolute magnitude. 

2.1. Building the Sample Database 

From the current photometric data in the SDSS 
archives we extract eight stripes for our analysis. These 
stripes combine to form approximately five coherent re- 
gions on the sky which range from approximately 90 de- 
grees to about 120 degrees in length. In the nomencla- 
ture of the SDSS these stripes are designated the num- 
bers 10-12, 35-37 and 76 and 86. The last two stripes 
come from the southern component of the survey. All 
data from these stripes have been designated as having 
survey quality photometric observations and astrometry. 
In total these stripes contain close to 20 million galaxies 
and are accessible through the SDSS Science Archive 
(Thakar et al.). 

Currently, there are two versions of the SDSS science 
archive running at Fermilab and remotely accessible to 
the collaboration. The "chunk" database contains stripes 
that have passed through the target selection process 
for identifying candidates for spectroscopic followup but 
with only photometry that was available when the spec- 
troscopic target selection was run on the region (i.e. pho- 
tometry for which the calibrations were not necessarily 
optimal). The "staging" database has the latest photometric 
data (for this paper we use version 5.2.8 of the 
photometric pipeline) but without the full target selec- 
tion information. For our purposes the quality of the 
photometric measurements is important but the target 
selection is completely irrelevant and thus we take our 
sample from the staging database. 

In order to be able to efficiently store, search and se- 
lect galaxies from the catalog, we create a local database 
using Microsoft's SQL Server. The relevant properties 
(position, redshift, galaxy type, absolute and apparent 
magnitudes) of all galaxies in the staging database were 
stored in this database server. The regions of the sky 
surveyed by these data, the seeing of the observations 
as a function of position on the sky and the position of 
bright stars that must be masked out when defining the 
survey geometry were all calculated internally from this 
data set. 

Fig. 1. — Density of galaxies in stripe 11. The image of the stripe is split in 4 to preserve the aspect ratio. The width of the stripe is 
2.5 degrees. The black pixels represent regions containing no objects. They are masked out due to bad field quality, seeing or bright stars. 
The narrowing of the stripe towards the left and right edge is due to the SDSS stripe geometry which ensures that objects are not counted 
twice. The pieces of the rectangle masked out towards the ends of the stripes are covered by adjacent stripes. 



From these data we restrict our sample to galaxies 
brighter than r* = 21. At this magnitude limit the star-galaxy 
separation is sufficiently accurate that it will not 
affect the angular clustering measures (Scranton et al. 
2003) and the photometric redshift errors are typically 
less than $\sigma = 0.06$ (Csabai et al. 2002). Applying this 
magnitude limit yields approximately 13 million galaxies 
from which to estimate the clustering signal. The sample 
was further restricted by excluding those regions of the 
stripes affected by the wings of bright stars or that were 
observed with poor seeing. Figure 1 shows the density of 
galaxies in stripe 11 with the masks over-plotted. Fields 
with seeing worse than 1.7″ and a 3′ × 3′ neighborhood 
around all bright stars with r* < 14 were discarded. 

We note that these selections were all accomplished by 
applying a series of SQL queries to the database rather 
than progressively pruning a catalog of galaxies. The 
boundaries of the stripes, the seeing on a field-by-field ba- 
sis and all bright stars were stored in the local database, 
so that masks could be generated on the fly over the area 
to be analyzed. As such the selection criteria that were 
applied to the database could be optimized in a relatively 
short period of time. 

2.2. Clustering from a Volume Limited Sample 

Often the clustering evolution of galaxies, particularly 
that defined by angular clustering studies, is character- 
ized as a function of limiting magnitude. While observa- 
tionally this is simple to determine, the results of these 
analyses are often difficult to interpret because in a magnitude 
limited sample the mix of the spectral types and 
absolute luminosities of galaxies is redshift dependent. 
We are, essentially, looking at the clustering properties 
of different types of galaxies as a function of limiting mag- 
nitude. The models to account for the clustering signal 
must, therefore, also be able to describe the evolution 
of the distribution of galaxies. Ideally we would sepa- 
rate out the effects of population mixing and study the 
evolution of angular clustering in terms of the intrinsic 
properties of galaxies (i.e. rest-frame color and luminos- 
ity), along with their distances. We can accomplish this 



if we use the photometric redshifts to select a volume 
limited sample of galaxies (i.e. one with a fixed absolute 
magnitude range as a function of redshift). 

The SDSS Early Data Release photometric redshift 
catalog by Csabai et al. (2002) is based on techniques 
(Budavári et al. 1999, 2000; Connolly et al. 1999; 
Csabai et al. 2000) that estimate the physical parameters 
describing the galaxy samples in a self-consistent 
way. The relationship for the EDR data set is applied 
to the photometric data selected from the imaging 
stripes and the spectral types, absolute magnitudes 
and k-corrections are stored within a database together 
with the photometric and positional information. All 
derived quantities assume a $\Lambda$ cosmology with h = 0.7, 
$\Omega_M = 0.3$ and $\Omega_\Lambda = 0.7$. 

Fig. 2. — Absolute r* magnitude vs. redshift relation as derived 
from the photometric redshift catalog. The dotted lines show our 
selection for the volume limited sample out to redshift 0.3. 

Figure 2 illustrates how the absolute magnitude limits 
in the r* band vary as a function of redshift. These absolute 
magnitude boundaries are well defined, as a function 
of redshift, by the galaxies within our sample. In 
this paper we study a volume-limited sample that extends 
out to a redshift z = 0.3 with limiting absolute 
magnitude $M_{r^*} = -19.97$. We further restrict the data 
set to those galaxies more distant than redshift z = 0.1. 
The reason for this lower redshift limit is that the main 
spectroscopic galaxy sample will contain many of the low 
redshift galaxies, so using photometric redshift estimates 
is not necessary. Also, the fractional error in z becomes 
large at lower redshifts due to the uncertainty in the 
photometric redshifts. The final catalog contains over 2 million 
galaxies. 
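
The selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the k-correction is passed in by the caller (it defaults to zero here, whereas the catalog stores per-galaxy k-corrections), and the distance is computed for the flat cosmology quoted in the text (h = 0.7, $\Omega_M = 0.3$, $\Omega_\Lambda = 0.7$).

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def comoving_distance_mpc(z, h=0.7, omega_m=0.3, omega_l=0.7, steps=1000):
    """Line-of-sight comoving distance in Mpc for a flat Lambda cosmology."""
    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + omega_l)

    dz = z / steps
    # Trapezoidal rule for the integral of dz / E(z) from 0 to z.
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    total += sum(inv_e(i * dz) for i in range(1, steps))
    return (C_KM_S / (100.0 * h)) * total * dz


def absolute_magnitude(m_app, z, k_corr=0.0):
    """Absolute magnitude from apparent magnitude and redshift (z > 0)."""
    d_lum_mpc = (1.0 + z) * comoving_distance_mpc(z)
    dist_mod = 5.0 * math.log10(d_lum_mpc * 1.0e6 / 10.0)
    return m_app - dist_mod - k_corr


def in_volume_limited_sample(m_app, z, k_corr=0.0,
                             z_min=0.1, z_max=0.3, m_abs_limit=-19.97):
    """True if a galaxy falls inside the redshift shell and is brighter
    than the limiting absolute magnitude (assumed cut: M < m_abs_limit)."""
    if not (z_min < z < z_max):
        return False
    return absolute_magnitude(m_app, z, k_corr) < m_abs_limit
```

With the k-correction set to zero, the apparent limit r* = 21 at z = 0.3 maps to an absolute magnitude of roughly -19.96 in this sketch, which is close to the limit quoted above.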

2.3. Rest-frame Selected Subsamples 

To analyze how angular clustering changes with luminosity 
in Section 3, the volume limited sample is divided 
into 3 absolute magnitude bins. These subsamples, denoted 
$M_1$, $M_2$ and $M_3$, have limiting absolute magnitudes 
$M_{r^*} > -21$, $-21 > M_{r^*} > -22$ and $-22 > M_{r^*} > -23$, 
respectively. The size of these subsets decreases by approximately 
a factor of two as a function of increasing 
luminosity. 

For the type dependent selection we utilize the continuous 
spectral type parameter t from the photometric 
redshift estimation. This essentially encodes the rest-frame 
colors of the galaxies as there is a direct one-to-one 
mapping between the type t and the spectral energy 
distribution (SED) of the galaxy. The value t = 0 represents 
a template galaxy that is as red as the elliptical 
spectrum of Coleman, Wu & Weedman (1980); as t increases 
the galaxy type becomes progressively later. In 
Section 3, we subdivide by spectral class, breaking the 
luminosity classes into four subgroups (each with comparable 
numbers of galaxies). The cuts in the spectral type 
parameter t from red to blue are t < 0.02, 0.02 < t < 0.3, 
0.3 < t < 0.65 and t > 0.65. These cuts define the 
$T_1$, $T_2$, $T_3$ and $T_4$ subsamples, respectively. Our selection 
is motivated by the spectral energy distributions of 
Coleman, Wu & Weedman (1980). The first class consists 
of galaxies with SEDs similar to the CWW elliptical 
template (Ell); the second, third and fourth classes 
contain a broader distribution of galaxy types approximately 
corresponding to Sbc, Scd and Irr types, respectively. 
The distribution of types and our classification 
are shown in Figures 3 and 4. 
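
As an illustration of the binning above, the following sketch assigns a galaxy to a luminosity bin and a spectral class. The function names are hypothetical, and the treatment of values that fall exactly on a boundary is an assumption, since the text only gives strict inequalities.

```python
import bisect

# Cuts in the spectral type parameter t, from red to blue (from the text).
TYPE_CUTS = [0.02, 0.3, 0.65]
TYPE_LABELS = ["T1", "T2", "T3", "T4"]


def spectral_class(t):
    """Map the continuous type t to T1..T4.
    Assumption: a value exactly on a cut goes to the bluer class."""
    return TYPE_LABELS[bisect.bisect_right(TYPE_CUTS, t)]


def luminosity_bin(m_abs, faint_limit=-19.97):
    """Map an r*-band absolute magnitude to M1..M3, or None if outside.
    Assumption: the faint edge of M1 is the volume limit of the sample."""
    if -21.0 < m_abs < faint_limit:
        return "M1"
    if -22.0 < m_abs <= -21.0:
        return "M2"
    if -23.0 < m_abs <= -22.0:
        return "M3"
    return None
```

For example, `spectral_class(0.1)` gives the second reddest class and `luminosity_bin(-21.5)` gives the middle luminosity bin.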

3. THE ANGULAR CORRELATION FUNCTION 

The properties of the angular correlation function and 
the estimators used to measure it from photometric catalogs 
have been extensively discussed in the astronomical 
literature (Kerscher et al. 2000). The probability of finding 
a galaxy within a solid angle $\delta\Omega$ on the celestial plane 
of the sky at angular distance $\theta$ from a randomly chosen object 
is given by (Peebles 1980) 

$$\delta P = n\,[1 + w(\theta)]\,\delta\Omega, \tag{1}$$

where n is the mean number of objects per unit solid 
angle. The angular two-point correlation function $w(\theta)$ 
basically gives the excess probability of finding an object 
compared to a uniform Poisson random point process. 

Fig. 3. — The distribution of spectral type is bimodal. The small 
arrows hanging from the top axis illustrate our subsamples selected 
by spectral type. The cut between the 2 reddest and bluest classes 
at type t = 0.3 (equivalent to rest-frame u* − r* = 2) splits the 
distribution into 2 halves of similar sizes. 

Traditionally the observations are compared to random 
catalogs that match the geometry of the survey. 
The computation usually consists of counting pairs of 
objects drawn from the actual and random catalogs 
and applying a minimum variance estimator such as 
that defined by Landy & Szalay (1993) or Hamilton 
(1993). In this study, we use the Landy-Szalay estimator 
(Landy & Szalay 1993), 

$$w_{LS} = \frac{DD - 2DR + RR}{RR}, \tag{2}$$

where DD, RR and DR represent the counts of data-data, 
random-random and data-random pairs with angular 
separation $\theta$, summed over the entire survey area. 
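
For reference, a direct pair-counting version of the Landy-Szalay estimator in Eq. (2) can be sketched as below. It is a naive illustration, not the method used in this paper: the function names are hypothetical, positions are treated as flat (x, y) coordinates in degrees, and each pair count is normalized by the total number of pairs of its kind, which is an assumption needed when the data and random catalogs differ in size.

```python
import math
from itertools import combinations, product


def separation(p, q):
    """Euclidean separation of two points given as (x, y) in degrees."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def count_pairs(a, b, r_min, r_max, auto=False):
    """Count pairs with r_min <= separation < r_max.
    With auto=True, each unordered pair within catalog `a` is counted once."""
    pairs = combinations(a, 2) if auto else product(a, b)
    return sum(1 for p, q in pairs if r_min <= separation(p, q) < r_max)


def landy_szalay(data, randoms, r_min, r_max):
    """Landy-Szalay estimate of w in one separation bin.
    Cost grows with the square of the catalog size."""
    n_d, n_r = len(data), len(randoms)
    dd = count_pairs(data, data, r_min, r_max, auto=True) / (n_d * (n_d - 1) / 2.0)
    rr = count_pairs(randoms, randoms, r_min, r_max, auto=True) / (n_r * (n_r - 1) / 2.0)
    dr = count_pairs(data, randoms, r_min, r_max) / (n_d * n_r)
    return (dd - 2.0 * dr + rr) / rr
```

The double loop over pairs is what makes this approach expensive for catalogs with millions of objects.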

3.1. Estimating $w(\theta)$ with a Fast Fourier Transform 

Even though distances on the sky are easy to compute 
mathematically, measuring the correlation function is not 
a trivial task, especially when it comes to large surveys. 
It is easy to see that any naive algorithm implementing 
the estimator in Eq. (2) scales with the square of the 
number of objects N in the survey, $O(N^2)$. It, therefore, 
becomes progressively more expensive in computational 
power to apply these techniques to data the size of the 
SDSS samples. 

Fig. 4. — Our classification and selection criteria are shown in 
this figure along with the number of galaxies in the subsamples. 

Fig. 5. — The 2-dimensional correlation functions are shown 
for all 8 stripes on logarithmic scale. The correlation function is 
expected to be isotropic. Its sensitivity to artifacts in the data 
makes it an excellent diagnostic tool. The horizontal direction is 
the scan direction, along the stripe. The elongated streak at zero 
lag is related to the flat-field vector (see text). Note that the plots 
have parity symmetry through the origin, thus any half of the 2D 
space contains all the information for computing $w(\theta)$. 

We propose a novel method to estimate $w(\theta)$ using 
the Fast Fourier Transform (FFT; e.g. Press et al. 1992), 
which scales as $O(N \log N)$, a significant 
improvement over the naive approach. The principle 
behind this is to group all galaxies into small cells 
within a grid and analyze this matrix instead of the 
point catalog. The implementation of this FFT approach, 
called eSpICE, is a Euclidean version of SpICE by 
Szapudi, Prunet & Colombi (2001). A discussion of its 
properties and scalings is given elsewhere (Szapudi et al. 
2003). Here we present an application of eSpICE to the 
SDSS galaxy angular clustering and limit our discussion 
of the algorithm to only an outline of how the method 
works. 

To apply a standard FFT analysis to the problem we 
operate in Euclidean space. In other words, Euclidean 
distances are computed instead of the correct angular 
separation. Given that we only examine separations of 
less than 2 degrees, the maximum relative distance error 
introduced by the use of the Euclidean approximation is 
only 0.00005. This has a negligible effect on the accuracy 
of our analysis, and the use of Euclidean distances 
increases the speed of the algorithms significantly. We 
use SDSS survey coordinates $(\mu, \nu)$ to define the position 
of an object within a stripe. These coordinates are 
defined locally for each stripe. As the individual stripes 
comprise great circles on a sphere the survey coordinates 
along these great circles are close to Euclidean (i.e. in 
this coordinate system every stripe looks as if it were 
equatorial). In fact for equatorial stripes 10 and 82, the 
$(\mu, \nu)$ coordinates are the same as (RA, Dec). Beyond 
the above data grid of galaxies, we need to describe the 
geometry of the survey area. This is done by a second 
grid, with the same dimensions, that describes the survey 
boundaries. We call this the window grid because 
it is 1 if the pixel is entirely inside the boundaries and 
0 otherwise. In this way we can also incorporate arbitrarily 
complex masks (e.g. for excluding regions around 
bright stars) as they are simply applied to the window 
grid. The data and window grids are then padded with 
zeros up to the maximum angular scale to avoid aliasing 
in the Fast Fourier Transform. Finally, eSpICE is used 
to calculate the two-point correlation function directly. 
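
The idea of correlating a data grid against a window grid with zero-padded FFTs can be sketched with NumPy as follows. This is not the eSpICE code, whose internals are not described here; it is a generic grid estimator in which the autocorrelation of the masked overdensity is divided by the autocorrelation of the window, and the function name and array conventions are assumptions.

```python
import numpy as np


def grid_correlation_2d(counts, window):
    """Estimate the 2D correlation function of gridded counts via FFT.

    counts : 2D array of galaxy counts per pixel
    window : 2D array, 1 inside the survey and 0 outside or masked
    Returns an array of shape (2*ny, 2*nx); zero lag sits at [ny, nx].
    """
    counts = np.asarray(counts, dtype=float)
    window = np.asarray(window, dtype=float)
    ny, nx = counts.shape
    padded = (2 * ny, 2 * nx)  # zero padding prevents wrap-around (aliasing)

    # Overdensity relative to the mean count of the unmasked pixels.
    mean = (counts * window).sum() / window.sum()
    delta = window * (counts / mean - 1.0)

    # Autocorrelations through the correlation theorem.
    f_delta = np.fft.rfft2(delta, s=padded)
    f_window = np.fft.rfft2(window, s=padded)
    pairs_delta = np.fft.irfft2(f_delta * np.conj(f_delta), s=padded)
    pairs_window = np.fft.irfft2(f_window * np.conj(f_window), s=padded)

    # Divide by the number of pixel pairs at each lag; empty lags become NaN.
    valid = pairs_window > 0.5
    w2d = np.full(padded, np.nan)
    w2d[valid] = pairs_delta[valid] / pairs_window[valid]
    return np.fft.fftshift(w2d)
```

Because masks only enter through `window`, adding a bright-star mask amounts to setting the affected pixels of that grid to zero.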

Our approach is to apply this algorithm to one stripe at 
a time. The data and window matrices are constructed 
by querying the SQL database to select the appropriate 
galaxies and masks. From these data we calculate the 
two-dimensional angular correlation function for all eight 
stripes. Figure 5 shows the 2D correlation function for 
all stripes used in this analysis. We find that the 2D cor- 
relation function is an extremely sensitive diagnostic tool 
for identifying systematics within the photometric data. 
The correlation function is expected to be isotropic. Ar- 
tifacts within the data or survey geometry distort this 
symmetry. This is seen, to varying degrees, in each of 
the eight 2D correlation functions. We find that within 
the 2D correlation function there is an elongated streak, 
at zero lag, along the scan direction which has structure 
on scales in excess of a few degrees. This arises due to 
errors in the flat-field vector. 

In a drift-scan survey the flat field is a one-dimensional 
vector (orthogonal to the direction of the scan). Errors 
within the flat field tend, therefore, to be correlated along 
the scan direction (i.e. along the columns of a stripe). 
This effect is seen within all of the individual stripe cor- 
relation functions. As it is, by definition, a zero lag ef- 
fect we can exclude it from the analysis by censoring this 
region of the 2D correlation function before azimuthally 
averaging the signal to get a 1D correlation function. We 
note, however, that even if we do not censor the data to 
remove this effect, the results discussed in the following 
sections are not affected, because the correlation function 
is averaged azimuthally and the elongated streak only 
affects a small fraction of the 2D correlation function. 
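
The censoring and azimuthal averaging described above might look like the following sketch. The function name, the pixel-scale parameter and the unweighted mean over pixels are assumptions; a production version would probably weight each lag by its number of pixel pairs.

```python
import numpy as np


def azimuthal_average(w2d, pixel_deg, bin_edges_deg, censor_rows=0):
    """Average a 2D correlation function in rings of angular separation.

    w2d           : 2D array with zero lag at the pixel [ny // 2, nx // 2];
                    axis 1 is taken to be the scan direction
    pixel_deg     : pixel size in degrees (hypothetical parameter)
    bin_edges_deg : increasing ring boundaries in degrees
    censor_rows   : if > 0, drop lags whose cross-scan offset is smaller
                    than this many pixels (the streak along the scan)
    """
    w2d = np.asarray(w2d, dtype=float)
    ny, nx = w2d.shape
    dy, dx = np.indices((ny, nx))
    dy = dy - ny // 2
    dx = dx - nx // 2
    sep = pixel_deg * np.hypot(dx, dy)

    keep = np.isfinite(w2d)
    if censor_rows > 0:
        keep &= np.abs(dy) >= censor_rows

    w1d = np.full(len(bin_edges_deg) - 1, np.nan)
    for i in range(len(w1d)):
        ring = keep & (sep >= bin_edges_deg[i]) & (sep < bin_edges_deg[i + 1])
        if ring.any():
            w1d[i] = w2d[ring].mean()
    return w1d
```

Calling it once with `censor_rows=0` and once with `censor_rows=1` gives a quick check of how much the zero-lag streak contributes to each ring.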

Fig. 6. — The covariance matrix of the entire volume limited 
sample is normalized so that the diagonal elements are white and 
the grayscale values represent the correlations. 

In total, the computation of the one-dimensional angular 
correlation function $w(\theta)$ takes less than 3 minutes for 
a stripe, which is several orders of magnitude faster than 
traditional two-point estimators. Having computed the 
correlation function for all stripes, we co-add the results, 
weighting each stripe by its number of galaxy pairs. 
At the same time the covariance of the signal is also estimated 
from the scatter between the different stripes. Figure 6 
shows the covariance matrix for the volume limited 
sample (z < 0.3). The image of the matrix is normalized 
so that the diagonal elements are always white and the 
(off-diagonal) gray scale values represent the correlation 
between the bins. Because of the logarithmic sampling 
of $w(\theta)$, the neighboring bins are farther apart and hence 
correlate progressively less as we go to larger scales and 
thus to larger bins. 
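
A minimal sketch of the co-addition step is given below. The pair-weighted mean follows the description in the text, but the exact covariance estimator is not specified there, so the unweighted scatter about the mean, divided by n(n - 1) to give the covariance of the mean, is an assumption, as is the function name.

```python
import numpy as np


def coadd_stripes(w_stripes, pair_weights):
    """Combine per-stripe correlation functions.

    w_stripes    : array of shape (n_stripes, n_bins)
    pair_weights : array of the same shape, number of pairs per bin
    Returns the weighted mean, the covariance of the mean estimated from
    the stripe-to-stripe scatter, and the normalized correlation matrix.
    """
    w = np.asarray(w_stripes, dtype=float)
    wt = np.asarray(pair_weights, dtype=float)
    mean = (wt * w).sum(axis=0) / wt.sum(axis=0)

    resid = w - mean
    n = w.shape[0]
    cov = resid.T @ resid / (n * (n - 1))  # assumed estimator

    sigma = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sigma, sigma)    # unit diagonal, as in the figure
    return mean, cov, corr
```

The returned `corr` matrix has ones on its diagonal, which corresponds to the normalization used when displaying the covariance matrix.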

3.2. The Clustering Scale Length 

With the measured correlation amplitudes as a function 
of angular separation in hand, we can obtain a parametric 
form of the scaling. The amplitude and power of the 
correlation are calculated by fitting the usual formula, 

$$w(\theta) = A_w \left(\frac{\theta}{\theta_0}\right)^{-\delta}. \tag{3}$$

Normalizing $\theta$ by $\theta_0 = 0.1^\circ$ enables the amplitude $A_w$ to be 
directly compared with the figures showing $w(\theta)$ measurements; 
$A_w$ is essentially the value of the correlation 
function at $\theta_0$, since $A_w = w(\theta_0)$. The parameters $A_w$ 
and $\delta$ are estimated by minimizing the cost function 

$$\chi^2(A_w, \delta) = \sum_{i,j=1}^{N} \Delta w_i \, C^{-1}_{ij} \, \Delta w_j, \tag{4}$$

where $\Delta w_i = w_i - w(\theta_i | A_w, \delta)$, N is the number of bins 
and $C_{ij}$ is the covariance matrix. Although $\chi^2(A_w, \delta)$ is 
not quadratic in the parameter $\delta$, it is in $A_w$, thus the 
equation 

$$\frac{\partial \chi^2}{\partial A_w} = 0 \tag{5}$$

can be solved analytically for $A_w$ to reduce the dimensionality 
of the problem. We then use a method by Brent 
(1973) (see Press et al. 1992) to search for the optimal 
value of $\delta$. The $\chi^2$ fit also provides information about the 
covariances of the estimated parameters that are shown 
as error ellipses in the next section. 
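
The two-step fit can be illustrated with the sketch below. For a fixed slope, setting the derivative of $\chi^2$ with respect to $A_w$ to zero gives $A_w = (s^T C^{-1} w)/(s^T C^{-1} s)$ with $s_i = (\theta_i/\theta_0)^{-\delta}$, so only a one-dimensional search over $\delta$ remains. To keep the example free of extra dependencies, a plain golden-section search is swapped in for Brent's method; the function names and the search interval are assumptions.

```python
import numpy as np

THETA0_DEG = 0.1  # normalization angle from Eq. (3)


def best_amplitude(delta, theta, w, c_inv):
    """Analytic minimizer of chi^2 over A_w at fixed slope delta."""
    shape = (theta / THETA0_DEG) ** (-delta)
    return (shape @ c_inv @ w) / (shape @ c_inv @ shape)


def chi2_profile(delta, theta, w, c_inv):
    """chi^2 of Eq. (4) with A_w already set to its best value."""
    a_w = best_amplitude(delta, theta, w, c_inv)
    resid = w - a_w * (theta / THETA0_DEG) ** (-delta)
    return resid @ c_inv @ resid


def fit_power_law(theta, w, cov, lo=0.0, hi=3.0, tol=1e-8):
    """Return (A_w, delta) minimizing chi^2; golden-section search in delta."""
    theta = np.asarray(theta, dtype=float)
    w = np.asarray(w, dtype=float)
    c_inv = np.linalg.inv(np.asarray(cov, dtype=float))
    ratio = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if chi2_profile(c, theta, w, c_inv) < chi2_profile(d, theta, w, c_inv):
            b = d
        else:
            a = c
    delta = 0.5 * (a + b)
    return best_amplitude(delta, theta, w, c_inv), delta
```

Solving for the amplitude analytically keeps the numerical search one-dimensional, which is the design choice the text describes.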

Figure 7 illustrates the angular clustering in our fiducial 
sample of all galaxies within the volume limited sample. 
In the left panel, the correlation function $w(\theta)$ is 
shown along with the best power-law fit. The range 
in angular separation varies from 1 arcminute to 2 degrees. 
At a mean redshift for the volume-limited sample 
of z = 0.2, 2° corresponds to ~17 $h^{-1}$ Mpc. The error 
bars on the measurements are computed as one over the 
square root of the diagonal elements of the covariance 
matrix (e.g. shown in Figure 6). In the right panel of 
Figure 7 we plot the best fit parameters and their error 
ellipses. For the full volume-limited sample the best 
fit to the data has a slope of $\delta = 0.84 \pm 0.02$ with an 
amplitude of $A_w = 0.078 \pm 0.001$. 

3.3. Clustering as a Function of Luminosity 

The angular clustering of the galaxies in the three luminosity 
bins described in Section 2.3 is compared in 
Figure 8. As noted earlier, the left panel shows the 
correlation function, and the right panel gives the parameters 
of the power-law fits. The first section (I) of 
Table 1 presents the values and errors on these measured 
parameters. As expected, the more luminous galaxies 
are clustered more strongly: the amplitudes increase by 
roughly a factor of 1.5 from one sample to the next 
and are measured to be 0.061 ± 0.001, 0.092 ± 0.002 and 
0.138 ± 0.004. The slopes of the correlation functions are 
consistent for all luminosity bins and with the fiducial 
value derived for the entire volume (the estimated slope 
parameters scatter around $\delta = 0.84$). 

3.4. Bimodality of $w(\theta)$ as a Function of Rest-frame Color 

The type dependence of the correlation function is not 
as simple as that found for the luminosity classes. It 
is well known that different types of galaxies have different 
clustering behavior (Giovanelli et al. 1986). Red 
elliptical galaxies are more likely to be found in higher 
density regions than spirals. Here, we study the evolution 
of the angular correlation function with spectral 
type in two absolute magnitude ranges, $M_{r^*} > -21$ and 
$-21 > M_{r^*} > -23$. Both of these yield approximately 1 
million galaxies within our volume. Figure 9 shows how 
the clustering changes as a function of spectral type in 
the lower luminosity sample. The 2 reddest classes of the 
galaxy population ($T_1$ and $T_2$) are essentially indistinguishable; 
their clustering is significantly stronger than 
for the other classes or our fiducial results. The bluest 2 
classes ($T_3$ and $T_4$) have approximately the same power-law 
exponents but with different amplitudes. Section II 
of Table 1 summarizes the results of parameter fitting, 
which are also shown in the right panel of Figure 9. The 
higher luminosity classes show the same basic trends but 
with a stronger correlation amplitude. The results of the 
type dependence of power law fits to the high luminosity 
class are given in Figure 10 and Table 1. 

4. THE CORRELATION LENGTH FOR GALAXIES 

4.1. Limber's Equation 

From the angular clustering we can derive the spatial 
correlation length, $r_0$, given the redshift distribution of 
the data (Peebles 1980). This is accomplished by integrating 
over the comoving coordinate along two lines of 
sight $r_1$ and $r_2$, separated by the angle $\theta$, to calculate the 
projected angular correlations from the real-space correlation 
function $\xi(r) = (r/r_0)^{-\gamma}$, 

$$w(\theta) = \int \frac{r_1^2\,\Phi(r_1)\,dr_1}{F(r_1)} \int \frac{r_2^2\,\Phi(r_2)\,dr_2}{F(r_2)}\; \xi_{12}, \tag{6}$$

Fig. 7. — Angular clustering in the fiducial sample of galaxies in the volume limited sample (z < 0.3). The correlation function is shown 
(left panel) along with the best power-law fit using the formula $w(\theta) = A_w (\theta/0.1^\circ)^{-\delta}$ (right panel). 

Fig. 8. — The clustering strength changes as a function of absolute magnitude without change in the slope. The correlation function 
and the parameter fits are shown in the left and right panels, respectively. The straight line represents the best fit to the fiducial sample 
in Fig. 7. 



where $\Phi(r)$ is the selection function and the factor
$F(r) = 1 - kr^2$ accounts for the curvature of space.$^{13}$
In the small angle approximation, the correlation function
$\xi_{12} = \xi(r_{12})$, with
$r_{12}^2 = r_1^2 + r_2^2 - 2 r_1 r_2 \cos\theta$, becomes
$\xi_{12} = (r/r_0)^{-\gamma}\,(\theta^2 + u^2/r^2)^{-\gamma/2}$
using the variables $2r = r_1 + r_2$ and $u = r_1 - r_2$.
This simplifies the above integral, because one can
separate out the part that depends on $u$ and arrive at

$$w(\theta) = H_\gamma\, r_0^{\gamma}\, \theta^{1-\gamma} \int r^{5-\gamma}\, \frac{\Phi^2(r)}{F^2(r)}\, dr, \tag{7}$$



$^{13}$ In our case $F(r)$ would be identically 1 because the parameters
$\Omega_M = 0.3$ and $\Omega_\Lambda = 0.7$ describe a flat universe; however, we will
do the integral in redshift (see later).



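where

$$H_\gamma = \frac{\Gamma\!\left(\tfrac{1}{2}\right)\,\Gamma\!\left(\tfrac{\gamma-1}{2}\right)}{\Gamma\!\left(\tfrac{\gamma}{2}\right)}. \tag{8}$$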
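
As a quick numerical check of Eq. (8) (a sketch, not part of the original analysis): $H_\gamma$ is the value of the line-of-sight integral $\int_{-\infty}^{\infty} (1+t^2)^{-\gamma/2}\,dt$ with $t = u/(r\theta)$, which is what remains after the part depending on $u$ is separated out. The function names below are illustrative only.

```python
import math


def h_gamma(gamma):
    """Closed form of Eq. (8): Gamma(1/2) Gamma((gamma-1)/2) / Gamma(gamma/2)."""
    return math.gamma(0.5) * math.gamma(0.5 * (gamma - 1.0)) / math.gamma(0.5 * gamma)


def h_gamma_numeric(gamma, n=200_000):
    """Midpoint-rule estimate of int (1 + t^2)^(-gamma/2) dt over the real line.

    With t = tan(phi) the integral becomes int cos(phi)^(gamma - 2) dphi
    over (-pi/2, pi/2), which converges for gamma > 1.
    """
    step = math.pi / n
    total = 0.0
    for i in range(n):
        phi = -0.5 * math.pi + (i + 0.5) * step
        total += math.cos(phi) ** (gamma - 2.0)
    return total * step


if __name__ == "__main__":
    # gamma = 1 + delta for the slopes 0.68, 0.84 and 0.96 quoted in the text
    for g in (1.68, 1.84, 1.96):
        print(g, h_gamma(g), h_gamma_numeric(g))
```

The two estimates should agree to a few decimal places; the numerical one converges slowly because the transformed integrand has an integrable singularity at the end points.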



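We can rewrite the integral using redshift $z$ and its
distribution. Substituting $r^2\,\Phi\,dr = \frac{dN}{dz}\,dz$ one gets the final
integral of

$$w(\theta) = H_\gamma\, r_0^{\gamma}\, \theta^{1-\gamma} \int_0^\infty r^{1-\gamma} \left(\frac{dN}{dz}\right)^{2} \left(\frac{dr}{dz}\right)^{-1} dz \tag{9}$$

(with $F = 1$ for our flat cosmology) that may be compared to the measured
quantities directly. We have assumed that the selection function and
the redshift distribution are normalized to $\int \frac{dN}{dz}\,dz = 1$,
which can be easily achieved numerically.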
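
The inversion of the Limber integral in redshift for $r_0$ can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes the flat $\Omega_M = 0.3$, $\Omega_\Lambda = 0.7$ cosmology of the text, a power-law fit $w(\theta) = A_w(\theta/0.1^\circ)^{-\delta}$ with $\gamma = 1 + \delta$, and, purely for illustration, a top-hat $dN/dz$ between 0.1 and 0.3 in place of the probabilistic redshift distribution. All function names and the constant `C_OVER_H0` are hypothetical.

```python
import math

C_OVER_H0 = 2997.92458  # c / (100 km/s/Mpc): Hubble distance in h^-1 Mpc


def e_of_z(z, omega_m=0.3, omega_l=0.7):
    """Dimensionless Hubble rate H(z)/H0 for a flat Lambda-CDM model."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_l)


def h_gamma(gamma):
    """Gamma(1/2) Gamma((gamma-1)/2) / Gamma(gamma/2)."""
    return math.gamma(0.5) * math.gamma(0.5 * (gamma - 1.0)) / math.gamma(0.5 * gamma)


def limber_integral(gamma, dndz, z_max, n=4000):
    """Midpoint estimate of int r^(1-gamma) (dN/dz)^2 (dr/dz)^-1 dz on [0, z_max]."""
    dz = z_max / n
    r = 0.0  # comoving distance at the left edge of the current step
    total = 0.0
    for i in range(n):
        zc = (i + 0.5) * dz
        drdz = C_OVER_H0 / e_of_z(zc)
        r_mid = r + 0.5 * dz * drdz
        total += r_mid ** (1.0 - gamma) * dndz(zc) ** 2 / drdz * dz
        r += dz * drdz
    return total


def amplitude_from_r0(r0, delta, dndz, z_max, theta0_deg=0.1):
    """Amplitude A_w = w(theta0) predicted for a given r0 (h^-1 Mpc)."""
    gamma = 1.0 + delta
    theta0 = math.radians(theta0_deg)
    return (h_gamma(gamma) * r0 ** gamma * theta0 ** (-delta)
            * limber_integral(gamma, dndz, z_max))


def r0_from_fit(a_w, delta, dndz, z_max, theta0_deg=0.1):
    """Solve the same relation for r0 given the fitted A_w and delta."""
    gamma = 1.0 + delta
    theta0 = math.radians(theta0_deg)
    integral = limber_integral(gamma, dndz, z_max)
    return (a_w * theta0 ** delta / (h_gamma(gamma) * integral)) ** (1.0 / gamma)


def tophat_dndz(z, z_lo=0.1, z_hi=0.3):
    """Illustrative normalized top-hat dN/dz (an assumption of this sketch)."""
    return 1.0 / (z_hi - z_lo) if z_lo <= z <= z_hi else 0.0


if __name__ == "__main__":
    print(r0_from_fit(0.078, 0.84, tophat_dndz, z_max=0.5))
```

Because a top-hat is narrower than a distribution smeared by photometric redshift errors, this toy input is not expected to reproduce the correlation lengths quoted in the paper; it only shows how the pieces fit together.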

4.2. Estimating dN/dz

To determine the correlation length, we need to estimate
$dN/dz$. Given a photometric redshift and its random
error, there is a conditional probability of having a
galaxy at a certain true redshift. In addition, we need
to incorporate the apparent magnitude limit and
photometric redshift selection criteria in the estimate of the
redshift distribution. The real redshift $z$ and the
photometric redshift $s$ of a galaxy are different. We assume
that $s = z + \nu$, where $\nu$ is the error, drawn from a
normal distribution. Thus

$$P(s|z) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(s-z)^2}{2\sigma^2}\right], \tag{10}$$

where $\sigma$ determines the precision of the estimates. We
need the inverse: what is the true redshift, given the
photometric estimate? This may be obtained from Bayes'
theorem,

$$P(z|s) = \frac{P(z)\,P(s|z)}{P(s)}, \tag{11}$$

where $P(z)$ is the true redshift distribution calculated
from the LF, which also depends on the apparent
magnitude cuts.

Our volume limited sample is selected by a window
function of photometric redshifts,

$$W(s) = \begin{cases} 1 & \text{if } s \text{ is between 0.1 and 0.3,} \\ 0 & \text{otherwise.} \end{cases} \tag{12}$$

Fig. 11. The redshift distribution is computed by adding the
contributions of narrow r* intervals. The top panel (a) shows the
number density of galaxies with r* between 18 and 18.2 as a function
of redshift, as derived from the luminosity function. The black
shaded curve is the corresponding window function with $\sigma = 0.05$.
The product of these two, shown in panel (b), is the effective
contribution of the magnitude bin. Panel (c) illustrates the final redshift
histogram, which is the properly weighted sum of products like
the one above.



The conditional probability of having a galaxy in a sample
selected by this window function is calculated by the
integral over the distribution $P(s)$,

$$\begin{aligned}
P(z|W) &= \int ds\, P(s)\, W(s)\, P(z|s) && (13) \\
       &= P(z) \int ds\, W(s)\, P(s|z) && (14) \\
       &= P(z)\, W_\sigma(z), && (15)
\end{aligned}$$

where $W_\sigma(z)$ is the photometric redshift selection function
convolved with the photometric redshift uncertainty.
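
For the Gaussian error model and the top-hat window used here, the convolved window has a closed form in terms of the error function, $W_\sigma(z) = \frac{1}{2}\left[\mathrm{erf}\!\left(\frac{0.3 - z}{\sqrt{2}\sigma}\right) - \mathrm{erf}\!\left(\frac{0.1 - z}{\sqrt{2}\sigma}\right)\right]$. The sketch below (function names are illustrative, not from the paper) evaluates it and cross-checks it against a direct numerical integration of $P(s|z)$ over the window.

```python
import math


def gaussian_pdf(s, z, sigma):
    """P(s|z): normal distribution of the photometric redshift s around z."""
    return math.exp(-0.5 * ((s - z) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def smoothed_window(z, sigma, s_lo=0.1, s_hi=0.3):
    """W_sigma(z) = int W(s) P(s|z) ds, in closed form via the error function."""
    root2 = math.sqrt(2.0)
    return 0.5 * (math.erf((s_hi - z) / (sigma * root2))
                  - math.erf((s_lo - z) / (sigma * root2)))


def smoothed_window_numeric(z, sigma, s_lo=0.1, s_hi=0.3, n=2000):
    """Same quantity by midpoint integration of P(s|z) over the top-hat."""
    ds = (s_hi - s_lo) / n
    return sum(gaussian_pdf(s_lo + (i + 0.5) * ds, z, sigma) for i in range(n)) * ds


if __name__ == "__main__":
    for z in (0.05, 0.1, 0.2, 0.3, 0.35):
        print(z, smoothed_window(z, 0.05), smoothed_window_numeric(z, 0.05))
```

With $\sigma = 0.05$ a galaxy at the centre of the window ($z = 0.2$) is two standard deviations from either edge, so it is selected with probability close to 0.95, while a galaxy exactly at an edge is selected about half of the time.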

The precision of photometric redshifts is a strong function
of the apparent magnitude. From the photometric
redshift catalog, we compute the mean redshift errors
$\sigma_i$ and the galaxy counts $n_i$ (for proper weighting)
in $\Delta r^* = 0.2$ wide magnitude bins. Using the LF of
Blanton et al. (2003), we derive the redshift distributions
$P_i(z)$ for the same magnitude bins up to $r^* = 21$, which is
the limiting magnitude in the sample. The final redshift
distribution is the weighted average of these probabilities,

$$P(z|W) \propto \sum_i n_i\, P_i(z)\, W_{\sigma_i}(z). \tag{16}$$

The top panel of Figure 11 illustrates the redshift
distribution derived from the LF for galaxies with $18 < r^* < 18.2$
and a smoothed window $W_\sigma(z)$ with $\sigma = 0.05$. The
middle panel shows the effective contribution of this
magnitude bin to the final $dN/dz$, which is plotted in the last
panel.
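
The weighted sum of Eq. (16) can be sketched as below. This is only an illustration of the bookkeeping: the per-bin distributions $P_i(z)$ in the paper come from the luminosity function, whereas `toy_bin_pz` is a made-up placeholder shape, and the counts, errors and peak redshifts in `example_bins` are hypothetical numbers, not values from the catalog.

```python
import math


def smoothed_window(z, sigma, s_lo=0.1, s_hi=0.3):
    """Top-hat photo-z window convolved with a Gaussian error of width sigma."""
    root2 = math.sqrt(2.0)
    return 0.5 * (math.erf((s_hi - z) / (sigma * root2))
                  - math.erf((s_lo - z) / (sigma * root2)))


def toy_bin_pz(z, z_peak):
    """Placeholder for P_i(z), normalized on z >= 0; NOT derived from an LF."""
    x = z / z_peak
    return 4.0 / (math.sqrt(math.pi) * z_peak) * x * x * math.exp(-x * x)


def combined_dndz(z_grid, bins):
    """Sum of n_i P_i(z) W_sigma_i(z) over bins, normalized to unit integral.

    bins is a list of (n_i, sigma_i, z_peak_i) tuples; z_grid must be uniform.
    """
    raw = [sum(n_i * toy_bin_pz(z, z_peak) * smoothed_window(z, sigma)
               for n_i, sigma, z_peak in bins)
           for z in z_grid]
    dz = z_grid[1] - z_grid[0]
    norm = sum(raw) * dz
    return [value / norm for value in raw]


# Hypothetical bins: (relative count, photo-z error, peak redshift of P_i)
example_bins = [(1.0, 0.03, 0.15), (2.5, 0.05, 0.22), (4.0, 0.08, 0.30)]
example_grid = [i * 0.001 for i in range(1001)]
example_dndz = combined_dndz(example_grid, example_bins)
```

Fainter bins enter with a larger $\sigma_i$, so their contribution leaks further outside the nominal $0.1 < z < 0.3$ range, which is why the final $dN/dz$ is broader than the selection window.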

4.3. Results for $r_0$

Applying the above probabilistic redshift distribution
and substituting the power-law fits derived earlier, we
find a correlation length for the full volume limited
sample of $5.77\,h^{-1}$ Mpc. Estimating the uncertainty of this
value is not trivial. The statistical errors on the
parametric fits to $w(\theta)$ are known and may be used to
estimate the errors on the correlation length. These are
estimated by calculating the scatter of the predicted $r_0$
values for 20,000 Monte-Carlo realizations of the
fitting parameters, $A_w$ and $\delta$, based on their covariance
matrix. For the full volume limited sample, we obtain
$\Delta_{\rm rms} = 0.05\,h^{-1}$ Mpc. However, the uncertainty on $r_0$
is also affected by the uncertainty in the redshift
distribution. The primary source of change in $dN/dz$
is the uncertainty in the LF parameters. We
compute the partial derivatives $(\partial r_0/\partial\alpha)$, $(\partial r_0/\partial M_*)$ and
$(\partial r_0/\partial Q)$ numerically and propagate the quoted errors
of Blanton et al. (2003). We find that the evolutionary
parameter $Q$ makes the largest difference; the $r_0$
errors from the uncertainty of $M_*$ and $\alpha$ are negligible.
For the fiducial $r_0$ value, we estimate an error of
$\Delta_{\rm LF} = 0.09\,h^{-1}$ Mpc. The dependence of $\sigma$ on the
apparent magnitude was determined empirically using the
actual measurements, thus the results are not affected by
Malmquist bias.
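
The Monte-Carlo estimate of the statistical error can be sketched as follows: draw correlated pairs $(A_w, \delta)$ from their covariance matrix, map each pair to $r_0$, and take the scatter. This is a hedged illustration only. The mapping `toy_r0` replaces the full Limber inversion with a thin-shell approximation whose effective distance `r_eff` and constant `k_shell` are hypothetical, and the correlation coefficient in the example covariance is an assumption (the paper does not quote it here).

```python
import math
import random


def h_gamma(gamma):
    """Gamma(1/2) Gamma((gamma-1)/2) / Gamma(gamma/2)."""
    return math.gamma(0.5) * math.gamma(0.5 * (gamma - 1.0)) / math.gamma(0.5 * gamma)


def toy_r0(a_w, delta, r_eff=570.0, k_shell=1.8e-3, theta0_deg=0.1):
    """Thin-shell stand-in for the Limber inversion (hypothetical constants).

    Treats all galaxies as lying near comoving distance r_eff (h^-1 Mpc), so
    the Limber integral reduces to r_eff^(1-gamma) * k_shell.
    """
    gamma = 1.0 + delta
    theta0 = math.radians(theta0_deg)
    return (a_w * (theta0 * r_eff) ** delta / (h_gamma(gamma) * k_shell)) ** (1.0 / gamma)


def draw_pair(mean, cov, rng):
    """One draw from a 2-D normal distribution via its Cholesky factor."""
    (c11, c12), (_, c22) = cov
    l11 = math.sqrt(c11)
    l21 = c12 / l11
    l22 = math.sqrt(c22 - l21 * l21)
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return mean[0] + l11 * g1, mean[1] + l21 * g1 + l22 * g2


def r0_scatter(r0_of, mean, cov, n=20_000, seed=12345):
    """Mean and rms scatter of r0 over n realizations of (A_w, delta)."""
    rng = random.Random(seed)
    values = [r0_of(*draw_pair(mean, cov, rng)) for _ in range(n)]
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / (n - 1)
    return mu, math.sqrt(var)


# Fit errors of the fiducial sample; the correlation rho is an assumed value.
sig_a, sig_d, rho = 0.001, 0.02, 0.5
example_cov = [[sig_a ** 2, rho * sig_a * sig_d],
               [rho * sig_a * sig_d, sig_d ** 2]]
example_mean, example_rms = r0_scatter(toy_r0, (0.078, 0.84), example_cov)
```

In a real application `toy_r0` would be replaced by the full integral over the estimated $dN/dz$; the sampling and scatter computation stay the same.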

In Figure 12 the correlation length is plotted as a function
of luminosity (left panel) and SED type (right panel).
The relations between $r_0$ and luminosity and spectral type
are consistent with those observed directly from the
angular data and with measures of the clustering length
from spectroscopic surveys (Zehavi et al. 2002). Over
an absolute magnitude range of $-19.97$ to $-22$, $r_0$
increases with luminosity from a value of $5.04 \pm 0.09$ to
$7.87 \pm 0.24\,h^{-1}$ Mpc, which is consistent with the increase
observed by Zehavi et al. (2003).

The color dependence of $r_0$ (i.e. clustering as a function
of spectral type) shows the expected increase in clustering
length for early type galaxies. The values of the
correlation length for the spectral type subsamples are given
in Table 1. In the lower luminosity bin ($M_{r^*} > -21$), the
reddest galaxies, $T_1$, have a correlation length of
$6.59 \pm 0.17\,h^{-1}$ Mpc and the bluest galaxies, $T_4$, have a
correlation length of $4.51 \pm 0.19\,h^{-1}$ Mpc. The trend in
this relation is again consistent with the observed
dependence of the correlation length on spectral
and morphological type (Giovanelli et al. 1986).
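
The total errors quoted in the text follow from adding the statistical and luminosity-function error terms of Table 1 in quadrature. A short check of that arithmetic (the helper name is illustrative):

```python
import math


def total_error(stat_err, lf_err):
    """Combine two independent error terms in quadrature."""
    return math.hypot(stat_err, lf_err)


# (r_0, statistical error, luminosity-function error) in h^-1 Mpc, from Table 1
rows = {
    "fiducial": (5.77, 0.05, 0.09),
    "M_r* > -21, all types": (5.04, 0.05, 0.08),
    "-22 > M_r* > -23, all types": (7.87, 0.21, 0.12),
    "T1, M_r* > -21": (6.59, 0.14, 0.10),
    "T4, M_r* > -21": (4.51, 0.18, 0.07),
}

for name, (r0, stat, lf) in rows.items():
    print(f"{name}: {r0:.2f} +/- {total_error(stat, lf):.2f} h^-1 Mpc")
```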

5. DISCUSSION 

The interesting aspect of the luminosity and color
dependent clustering is not that the observed clustering
length scales with luminosity and color, as this has been
demonstrated by many different surveys, but rather
how the shape of the correlation function depends on
luminosity and color. It is remarkable that the luminosity
simply affects the amplitude of the correlation function
and not the slope, whereas the type selection affects both
the slope and amplitude. This is particularly intriguing
when we note that we would expect an intrinsic
correlation between the luminosity and spectral type of a galaxy.

Observationally, early type galaxies tend to reside in 
clusters of galaxies whereas later type galaxies are more 
often found in the field. We would expect, therefore, that 
early type galaxy samples would have more small sepa- 
ration pairs than samples selected for late type galaxies 
and that their resulting correlation functions would be 
steeper. This is consistent with the data except that we 

Fig. 12. The correlation length $r_0$ is plotted as a function of luminosity (left) and spectral type (right panel). The error bars include
both error terms added in quadrature.



would expect there to be a smooth transition from early
to late type galaxies and that the correlation function
slope should smoothly change from a steep value of 0.96
for the early types to the shallower value of 0.68 for
the late types. What we observe, however, is that red
galaxies (types $T_1$ and $T_2$) have a common slope of ~1.0
and blue galaxies (types $T_3$ and $T_4$) have a slope of ~0.7
(i.e. there does not appear to be a smooth transition).

We can, however, explain this behavior if we consider a
simple model for the distribution of galaxy types. From
Figure 3 we see that the type histogram for the galaxies is
almost bimodal (Hogg et al. 2003; Strateva et al. 2001)
with the distribution being well fit by two Gaussians. For
simplicity we will denote these subclasses as "red" and
"blue". If the "red" and "blue" populations have distinct
correlation functions (i.e. with different slopes), then any
observed correlation function should simply come from
mixing these populations. As we change the mix of "red"
and "blue" galaxies, the resulting slope of the
correlation function will also change. This is exactly what we
observe with the correlation function as we move from
the $T_2$ to $T_3$ selections.
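
The mixing argument can be illustrated with a small sketch. For a sample containing a fraction $f$ of "red" galaxies, pair counting gives $w = f^2 w_{rr} + 2f(1-f)\,w_{rb} + (1-f)^2 w_{bb}$. The cross-correlation $w_{rb}$ is not measured in this paper; the code assumes $w_{rb} = \sqrt{w_{rr} w_{bb}}$ purely for illustration. The power-law parameters are the Table 1 fits for the $0.02 < t < 0.3$ and $t > 0.65$ classes of the $M_{r^*} > -21$ sample, used here as stand-ins for pure "red" and "blue" populations.

```python
import math


def power_law(theta_deg, a_w, delta, theta0_deg=0.1):
    """w(theta) = A_w (theta / 0.1 deg)^(-delta)."""
    return a_w * (theta_deg / theta0_deg) ** (-delta)


def mixed_w(theta_deg, f_red, red=(0.109, 1.02), blue=(0.047, 0.68)):
    """w(theta) of a sample with a fraction f_red of "red" galaxies.

    The cross term w_rb = sqrt(w_rr * w_bb) is an assumption of this sketch.
    """
    w_rr = power_law(theta_deg, *red)
    w_bb = power_law(theta_deg, *blue)
    w_rb = math.sqrt(w_rr * w_bb)
    return (f_red ** 2 * w_rr
            + 2.0 * f_red * (1.0 - f_red) * w_rb
            + (1.0 - f_red) ** 2 * w_bb)


def local_slope(f_red, theta_deg=0.1, eps=1e-3):
    """Effective slope -d ln w / d ln theta from a central difference."""
    hi = mixed_w(theta_deg * math.exp(eps), f_red)
    lo = mixed_w(theta_deg * math.exp(-eps), f_red)
    return -(math.log(hi) - math.log(lo)) / (2.0 * eps)


if __name__ == "__main__":
    for f in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f, local_slope(f))
```

Under this assumption the effective slope moves monotonically between the "blue" and "red" values as the mix changes, while a sample drawn from a single population keeps that population's slope whatever its amplitude.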

If we selected galaxies only from either the "red" or
"blue" sub-populations we would expect no change in
the correlation function slope, as all of the "red" or "blue"
galaxies have a common correlation function. Again, this
is what we observe in the data. The slopes of the
correlation functions for the $T_1$ and $T_2$ red samples agree
within the errors, as do those for the blue $T_3$ and $T_4$
samples. In reality, the color selection that we applied
to the SDSS data (types $T_1$ through $T_4$) was not chosen
to optimally separate two distinct populations of galaxies
but rather to provide a simple subdivision of galaxies
based on the CWW spectral energy distributions. We
might expect there to remain some population mixing in
our $T_1$, $T_2$, $T_3$ and $T_4$ color cuts. We observe this effect
in that the amplitudes of the correlation functions for the
$T_3$ and $T_4$ classes are close but not identical; this would
imply that the $T_3$ class still contains a subset of the "red"
galaxies.

The luminosity dependence can be explained if we
note that the luminosity functions of the "red" and
"blue" classes are identical for magnitudes brighter than
$M_{r^*} = -20$ (Baldry et al. 2003). They deviate only at
the faint end of the luminosity function ("blue" galaxies
having a steeper faint end slope). Varying the luminosity
cuts should not change the mix of the galaxy populations
(unless we sample galaxies with $M_{r^*} > -20$). We would,
therefore, expect the shape of the correlation function to
be independent of the luminosity cuts (as is found from
the data). Given this hypothesis, if we selected a volume
limited sample of galaxies fainter than
$M_{r^*} = -20$ we would expect to find a dependence of the
slope on luminosity.

It is, therefore, remarkable that with such a simple 
model for the distribution of galaxies (i.e. just two classes 
with differing correlation functions) we can qualitatively 
describe the behavior of the correlation functions with 
color and luminosity. What is difficult to understand is 
why there would be a simple scaling of the amplitude of 
the correlation function with intrinsic luminosity as the 
spatial scales we sample are in the non-linear regime (i.e. 
a simple linear bias model is not necessarily appropriate). 
Identifying the physical mechanism that could give rise to 
the luminosity and type dependent bias that we observe 
remains an open question. 

Table 1. Power-law fits to correlation functions and correlation lengths

Sample    Luminosity          SED type         N†           A_w                  δ            r_0‡
--------  ------------------  ---------------  -----------  -------------------  -----------  ------------------
Fiducial  All                 All              2,016        0.078 ± 0.001        0.84 ± 0.02  5.77 ± 0.05 ± 0.09

I         M_r* > -21          All              1,098        [illegible] ± 0.001  0.84 ± 0.03  5.04 ± 0.05 ± 0.08
          -21 > M_r* > -22    All              650          0.092 ± 0.002        0.85 ± 0.04  6.26 ± 0.08 ± 0.10
          -22 > M_r* > -23    All              268          0.138 ± 0.004        0.84 ± 0.05  7.87 ± 0.21 ± 0.12

II        M_r* > -21          t < 0.02         [illegible]  [illegible] ± 0.004  0.96 ± 0.05  6.59 ± 0.14 ± 0.10
          M_r* > -21          0.02 < t < 0.3   254          0.109 ± 0.005        1.02 ± 0.06  6.28 ± 0.24 ± 0.09
          M_r* > -21          0.3 < t < 0.65   185          0.072 ± 0.008        0.66 ± 0.12  5.90 ± 0.51 ± 0.10
          M_r* > -21          t > 0.65         316          0.047 ± 0.003        0.68 ± 0.09  4.51 ± 0.18 ± 0.07

III       -21 > M_r* > -23    t < 0.02         [illegible]  [illegible] ± 0.007  0.96 ± 0.05  7.91 ± 0.36 ± 0.12
          -21 > M_r* > -23    0.02 < t < 0.3   326          0.155 ± 0.005        0.87 ± 0.04  8.21 ± 0.19 ± 0.12
          -21 > M_r* > -23    0.3 < t < 0.65   185          0.093 ± 0.010        0.70 ± 0.15  6.75 ± 0.56 ± 0.11
          -21 > M_r* > -23    t > 0.65         127          0.067 ± 0.010        0.72 ± 0.17  5.52 ± 0.59 ± 0.09

† Number of galaxies in subsample (×10^3).

‡ Correlation length in h^-1 Mpc. The two error estimates are the statistical error from the power-law fits and the error from the uncertainty of
the luminosity function parameters.